AITopics | loss weight

Collaborating Authors

loss weight

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Self-Critical Reasoning for Robust Visual Question Answering

Jialin Wu, Raymond Mooney

Neural Information Processing SystemsFeb-11-2026, 20:58:48 GMT

Neural Information Processing Systems http://nips.cc/

explanation, sensitivity, vqa system, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > Texas > Travis County > Austin (0.04)
North America > Canada (0.04)

Genre: Research Report (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Vision (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

e4a6222cdb5b34375400904f03d8e6a5-Supplemental.pdf

Neural Information Processing SystemsFeb-11-2026, 15:27:21 GMT

The split of training, validation and test sets follows the settings of the previous method[10]. The inputpointcloudconsists of2048pointsrepresented bytheirEuclidean coordinates sampled from a normalized object, and the indexes of keypoints are given. The learning rate is set to1 10 3 andhalvedevery10epochs. Wesetthetargetvarianceσ2t to4,thelossweight ofvariance regularization to1, and the loss weight of distributions regularization to0.01 to achieve the best results after tuning. Wesetthetargetvarianceσ2t to4,thelossweightofvariance regularization to1, and the loss weight of distributions regularization to0.01 to achieve the best results after tuning.

artificial intelligence, human3, machine learning, (16 more...)

Neural Information Processing Systems

Country: Oceania > Australia > New South Wales > Sydney (0.05)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.37)

Add feedback

Convergence and Sketching-Based Efficient Computation of Neural Tangent Kernel Weights in Physics-Based Loss

Hirsch, Max, Pichi, Federico

arXiv.org Artificial IntelligenceNov-20-2025

In multi-objective optimization, multiple loss terms are weighted and added together to form a single objective. These weights are chosen to properly balance the competing losses according to some meta-goal. For example, in physics-informed neural networks (PINNs), these weights are often adaptively chosen to improve the network's generalization error. A popular choice of adaptive weights is based on the neural tangent kernel (NTK) of the PINN, which describes the evolution of the network in predictor space during training. The convergence of such an adaptive weighting algorithm is not clear a priori. Moreover, these NTK-based weights would be updated frequently during training, further increasing the computational burden of the learning process. In this paper, we prove that under appropriate conditions, gradient descent enhanced with adaptive NTK-based weights is convergent in a suitable sense. We then address the problem of computational efficiency by developing a randomized algorithm inspired by a predictor-corrector approach and matrix sketching, which produces unbiased estimates of the NTK up to an arbitrarily small discretization error. Finally, we provide numerical experiments to support our theoretical findings and to show the efficacy of our randomized algorithm. Code Availability: https://github.com/maxhirsch/Efficient-NTK

approximation, artificial intelligence, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2511.1553

Country:

North America > United States > California > Alameda County > Berkeley (0.14)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
Europe > Portugal > Braga > Braga (0.04)
Europe > Italy > Friuli Venezia Giulia > Trieste Province > Trieste (0.04)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Model-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)

Add feedback

Physics-Informed Neural Network Frameworks for the Analysis of Engineering and Biological Dynamical Systems Governed by Ordinary Differential Equations

Whitman, Tyrus, Particka, Andrew, Diers, Christopher, Griffin, Ian, Wickramasinghe, Charuka, Ranaweera, Pradeep

arXiv.org Artificial IntelligenceNov-19-2025

In this study, we present and validate the predictive capability of the Physics-Informed Neural Networks (PINNs) methodology for solving a variety of engineering and biological dynamical systems governed by ordinary differential equations (ODEs). While traditional numerical methods a re effective for many ODEs, they often struggle to achieve convergence in problems involving high stiffness, shocks, irregular domains, singular perturbations, high dimensions, or boundary discontinuities. Alternatively, PINNs offer a powerful approach for handling challenging numerical scenarios. In this study, classical ODE problems are employed as controlled testbeds to systematically evaluate the accuracy, training efficiency, and generalization capability under controlled conditions of the PINNs framework. Although not a universal solution, PINNs can achieve superior results by embedding physical laws directly into the learning process. We first analyze the existence and uniqueness properties of several benchmark problems and subsequently validate the PINNs methodology on these model systems. Our results demonstrate that for complex problems to converge to correct solutions, the loss function components data loss, initial condition loss, and residual loss must be appropriately balanced through careful weighting. We further establish that systematic tuning of hyperparameters, including network depth, layer width, activation functions, learning rate, optimization algorithms, w eight initialization schemes, and collocation point sampling, plays a crucial role in achieving accurate solutions. Additionally, embedding prior knowledge and imposing hard constraints on the network architecture, without loss the generality of the ODE system, significantly enhances the predictive capability of PINNs.

artificial intelligence, deep learning, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2511.00043

Country:

North America > United States > New York (0.04)
North America > United States > California (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Therapeutic Area > Neurology (0.46)
Health & Medicine > Therapeutic Area > Oncology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

AV-Dialog: Spoken Dialogue Models with Audio-Visual Input

Chen, Tuochao, Veluri, Bandhav, Gong, Hongyu, Gollakota, Shyamnath

arXiv.org Artificial IntelligenceNov-17-2025

Dialogue models falter in noisy, multi-speaker environments, often producing irrelevant responses and awkward turn-taking. We present AV-Dialog, the first multimodal dialog framework that uses both audio and visual cues to track the target speaker, predict turn-taking, and generate coherent responses. By combining acoustic tokenization with multi-task, multi-stage training on monadic, synthetic, and real audio-visual dialogue datasets, AV-Dialog achieves robust streaming transcription, semantically grounded turn-boundary detection and accurate responses, resulting in a natural conversational flow. Experiments show that AV-Dialog outperforms audio-only models under interference, reducing transcription errors, improving turn-taking prediction, and enhancing human-rated dialogue quality. These results highlight the power of seeing as well as hearing for speaker-aware interaction, paving the way for {spoken} dialogue agents that perform {robustly} in real-world, noisy environments.

artificial intelligence, dataset, natural language, (18 more...)

arXiv.org Artificial Intelligence

2511.11124

Country:

North America > United States > Virginia (0.04)
North America > United States > Florida > Miami-Dade County > Miami (0.04)
Asia > Thailand > Bangkok > Bangkok (0.04)
Asia > Singapore (0.04)

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.91)

Add feedback

Imbalance in Balance: Online Concept Balancing in Generation Models

Shi, Yukai, Ou, Jiarong, Chen, Rui, Yang, Haotian, Wang, Jiahao, Tao, Xin, Wan, Pengfei, Zhang, Di, Gai, Kun

arXiv.org Artificial IntelligenceNov-12-2025

In visual generation tasks, the responses and combinations of complex concepts often lack stability and are error-prone, which remains an under-explored area. In this paper, we attempt to explore the causal factors for poor concept responses through elaborately designed experiments. W e also design a concept-wise equalization loss function (IMBA loss) to address this issue. Our proposed method is online, eliminating the need for offline dataset processing, and requires minimal code changes.

artificial intelligence, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2507.13345

Country: Europe > Latvia > Lubāna Municipality > Lubāna (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

Sampling and Loss Weights in Multi-Domain Training

Salmani, Mahdi, Worah, Pratik, Razaviyayn, Meisam, Mirrokni, Vahab

arXiv.org Artificial IntelligenceNov-11-2025

In the training of large deep neural networks, there is a need for vast amounts of training data. To meet this need, data is collected from multiple domains, such as Wikipedia and GitHub. These domains are heterogeneous in both data quality and the diversity of information they provide. This raises the question of how much we should rely on each domain. Several methods have attempted to address this issue by assigning sampling weights to each data domain using heuristics or approximations. As a first step toward a deeper understanding of the role of data mixing, this work revisits the problem by studying two kinds of weights: sampling weights, which control how much each domain contributes in a batch, and loss weights, which scale the loss from each domain during training. Through a rigorous study of linear regression, we show that these two weights play complementary roles. First, they can reduce the variance of gradient estimates in iterative methods such as stochastic gradient descent (SGD). Second, they can improve generalization performance by reducing the generalization gap. We provide both theoretical and empirical support for these claims. We further study the joint dynamics of sampling weights and loss weights, examining how they can be combined to capture both contributions.

artificial intelligence, gt 2, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2511.06913

Country:

North America > United States > California (0.14)
North America > United States > New York (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.36)

Add feedback

Self-Critical Reasoning for Robust Visual Question Answering

Jialin Wu, Raymond Mooney

Neural Information Processing SystemsOct-9-2025, 13:39:41 GMT

Neural Information Processing Systems http://nips.cc/

explanation, machine learning, natural language, (20 more...)

Neural Information Processing Systems

Country:

North America > United States > Texas > Travis County > Austin (0.04)
North America > Canada (0.04)

Genre: Research Report (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Vision (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Quantifying constraint hierarchies in Bayesian PINNs via per-constraint Hessian decomposition

Landgren, Filip

arXiv.org Artificial IntelligenceOct-7-2025

Bayesian physics-informed neural networks (B-PINNs) merge data with governing equations to solve differential equations under uncertainty. However, interpreting uncertainty and overconfidence in B-PINNs requires care due to the poorly understood effects the physical constraints have on the network; overconfidence could reflect warranted precision, enforced by the constraints, rather than miscalibration. Motivated by the need to further clarify how individual physical constraints shape these networks, we introduce a scalable, matrix-free Laplace framework that decomposes the posterior Hessian into contributions from each constraint and provides metrics to quantify their relative influence on the loss landscape. Applied to the Van der Pol equation, our method tracks how constraints sculpt the network's geometry and shows, directly through the Hessian, how changing a single loss weight non-trivially redistributes curvature and effective dominance across the others.

artificial intelligence, constraint, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2510.03278

Country: Europe > United Kingdom (0.04)

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.71)

Add feedback

Impact of Loss Weight and Model Complexity on Physics-Informed Neural Networks for Computational Fluid Dynamics

Chou, Yi En, Liu, Te Hsin, Lin, Chao-An

arXiv.org Artificial IntelligenceOct-1-2025

Physics Informed Neural Networks offer a mesh free framework for solving PDEs but are highly sensitive to loss weight selection. We propose two dimensional analysis based weighting schemes, one based on quantifiable terms, and another also incorporating unquantifiable terms for more balanced training. Benchmarks on heat conduction, convection diffusion, and lid driven cavity flows show that the second scheme consistently improves stability and accuracy over equal weighting. Notably, in high Peclet number convection diffusion, where traditional solvers fail, PINNs with our scheme achieve stable, accurate predictions, highlighting their robustness and generalizability in CFD problems.

artificial intelligence, machine learning, pinn, (17 more...)

arXiv.org Artificial Intelligence

2509.21393

Country:

North America > United States (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Taiwan (0.04)
(2 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback